


Creators/Authors contains: "Boluki, Shahin"


  1. Designing and/or controlling complex systems in science and engineering relies on appropriate mathematical modeling of system dynamics. Classical differential-equation-based solutions in applied and computational mathematics are often computationally demanding. Recently, the connection between reduced-order models of high-dimensional differential equation systems and surrogate machine learning models has been explored. However, existing reduced-order and machine learning models for complex systems have focused on how best to approximate the high-fidelity model of choice. Because these systems are highly complex and the training data available for deriving reduced-order or machine learning surrogate models are often limited, it is critical that the derived reduced-order models also provide reliable uncertainty quantification. In this paper, we propose a novel framework of Bayesian reduced-order models that is naturally equipped with uncertainty quantification, because it learns the distributions of the parameters of the reduced-order models instead of their point estimates. In particular, we develop learnable Bayesian proper orthogonal decomposition (BayPOD), which learns the distributions of both the POD projection bases and the mapping from the system input parameters to the projected scores/coefficients, so that the learned BayPOD can help predict high-dimensional system dynamics/fields as quantities of interest in different setups with reliable uncertainty estimates. The developed learnable BayPOD inherits the capability of embedding physics constraints when learning the POD-based surrogate reduced-order models, a desirable feature when studying complex systems in science and engineering applications where the available training data are limited. Furthermore, the proposed BayPOD method is an end-to-end solution that, unlike other surrogate-based methods, does not require separate POD and machine learning steps. Results are reported from a real-world case study of the pressure field around an airfoil.
  2. Gorodkin, Jan (Ed.)
    Motivation: When learning to subtype complex disease based on next-generation sequencing data, the amount of available data is often limited. Recent works have tried to leverage data from other domains to design better predictors in the target domain of interest, with varying degrees of success. However, these methods either require the outcome labels to correspond across domains or cannot leverage the label information at all. Moreover, existing methods usually cannot benefit from other information available a priori, such as gene interaction networks. Results: In this article, we develop a generative optimal Bayesian supervised domain adaptation (OBSDA) model that can integrate RNA sequencing (RNA-Seq) data from different domains, along with their labels, to improve prediction accuracy in the target domain. Our model can be applied whether the domains share the same labels or have different ones. OBSDA is based on a hierarchical Bayesian negative binomial model with parameter factorization, for which the optimal predictor can be derived by marginalizing the likelihood over the posterior of the parameters. We first provide an efficient Gibbs sampler for parameter inference in OBSDA. Then, we leverage the gene-gene network prior information and construct an informed and flexible variational family to infer the posterior distributions of the model parameters. Comprehensive experiments on real-world RNA-Seq data demonstrate the superior performance of OBSDA, in terms of accuracy in identifying cancer subtypes, when utilizing data from different domains. Moreover, we show that taking advantage of the prior network information further improves performance. Availability and implementation: Source code for the implementations of OBSDA and SI-OBSDA is available at https://github.com/SHBLK/BSDA. Supplementary information: Supplementary data are available at Bioinformatics online.
  3. We propose a new model for supervised learning to rank. In our model, the relevance labels are assumed to follow a categorical distribution whose probabilities are constructed from a scoring function. We optimize the training objective with respect to the multivariate categorical variables using an unbiased and low-variance gradient estimator. Learning-to-rank methods can generally be categorized into pointwise, pairwise, and listwise approaches. Although our scoring function is pointwise, the proposed framework permits flexibility in the choice of the loss function: the loss need not be differentiable and can be either pointwise or listwise. On two datasets, our proposed method achieves better or comparable results relative to existing pairwise and listwise methods.
  4. We propose a unified framework for adaptive connection sampling in graph neural networks (GNNs) that generalizes existing stochastic regularization methods for training GNNs. The proposed framework not only alleviates over-smoothing and over-fitting tendencies of deep GNNs, but also enables learning with uncertainty in graph analytic tasks with GNNs. Instead of using fixed sampling rates or hand-tuning them as model hyperparameters as in existing stochastic regularization methods, our adaptive connection sampling can be trained jointly with GNN model parameters in both global and local fashions. GNN training with adaptive connection sampling is shown to be mathematically equivalent to an efficient approximation of training Bayesian GNNs. Experimental results with ablation studies on benchmark datasets validate that adaptively learning the sampling rate given graph training data is the key to boosting the performance of GNNs in semi-supervised node classification, making them less prone to over-smoothing and over-fitting with more robust prediction.
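
For readers unfamiliar with the fixed-rate stochastic regularization that the graph neural network abstract above takes as its starting point, the following is a minimal sketch in plain Python. It is not the authors' adaptive method: the keep probability here is a fixed, hand-set value, which is exactly the setting the abstract says it replaces with a learned rate. The function names (`sample_edges`, `mean_aggregate`), the toy graph, and the mean-aggregation step are all illustrative assumptions and do not come from the paper.

```python
import random


def sample_edges(edges, keep_prob, rng):
    """Keep each edge independently with probability keep_prob.

    Hypothetical helper: a fixed-rate Bernoulli mask over connections,
    not the learned (adaptive) sampling rate described in the abstract.
    """
    if not 0.0 <= keep_prob <= 1.0:
        raise ValueError("keep_prob must be in [0, 1]")
    return [edge for edge in edges if rng.random() < keep_prob]


def mean_aggregate(features, edges, num_nodes):
    """Average each node's feature with its neighbors' over undirected edges."""
    totals = list(features)      # every node starts with its own feature
    counts = [1] * num_nodes
    for u, v in edges:
        totals[u] += features[v]
        counts[u] += 1
        totals[v] += features[u]
        counts[v] += 1
    return [total / count for total, count in zip(totals, counts)]


# Toy 4-node cycle with scalar node features (assumed data).
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
features = [1.0, 2.0, 3.0, 4.0]

rng = random.Random(0)                    # seeded for repeatability
kept = sample_edges(edges, 0.5, rng)      # one random sub-graph
smoothed = mean_aggregate(features, kept, num_nodes=4)
print(kept, smoothed)
```

Drawing a fresh edge mask at every training step means each aggregation mixes in fewer neighbors, which is the intuition behind using connection sampling against over-smoothing. Averaging predictions over several masks at test time gives a rough spread that can serve as an uncertainty signal.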